Finding sequential matches in eye tracking data

ABSTRACT

Embodiments of the invention provide systems and methods for analyzing eye tracking data. The eye tracking data can represent a number of different scanpaths and can be analyzed, for example, to find patterns or commonality between the scanpaths. According to one embodiment, a method of analyzing eye tracking data can comprise receiving the eye tracking data which can include a plurality of scanpaths, each scanpath representing a sequence of regions of interest on a stimulus image. A dotplot can be generated and can include each of the plurality of scanpaths. One or more patterns within the eye tracking data can be identified based on the dotplot.

CROSS-REFERENCES TO RELATED APPLICATIONS

The present application claims benefit under 35 USC 119(e) of U.S. Provisional Application No. 61/113,538, filed on Nov. 11, 2008, entitled “Techniques For Analyzing Paths,” the entire contents of which are incorporated herein by reference for all purposes. The present application is also related to U.S. patent application Ser. No. ______ (Attorney Docket Number 021756-060810US) entitled “Using Dotplots for Comparing and Finding Patterns in Sequences of Data Points” and U.S. patent application Ser. No. ______ (Attorney Docket Number 021756-060830US) entitled “Time Expansion for Displaying Path Information” both of which are filed concurrently herewith and incorporated herein by reference for all purposes.

BACKGROUND

Embodiments of the present invention relate to analyzing sequential data, and more specifically to analyzing eye tracking data representing a plurality of scanpaths.

Analysis of paths is performed in various different fields or domains. For example, in eye tracking analysis, scanpaths representing users' eye movements while viewing a scene may be analyzed to determine high-level scanning strategies. The scanning strategies determined from such an analysis may be used to improve product designs. For example, by studying scanpaths for users viewing a web page, common viewing trends may be determined and used to improve the web page layout. Various other types of analyses on paths may be performed in other fields. Accordingly, new and improved techniques are always desirable for analyzing and displaying path-related information that can provide insight into characteristics of the path and that facilitate comparisons of paths.

BRIEF SUMMARY

Embodiments of the invention provide systems and methods for analyzing sequential data representing paths such as eye tracking data including scanpaths representing users' eye movements while viewing a stimulus image or other scene. The eye tracking data can represent a number of different scanpaths and can be analyzed, for example, to find patterns or commonality between the scanpaths. According to one embodiment, a method of analyzing eye tracking data can comprise receiving the eye tracking data which can include matches between a plurality of scanpaths, each scanpath representing a sequence of fixations, e.g., within regions of interest on a stimulus image. A dotplot can be generated representing matches between each of the plurality of scanpaths. One or more patterns within the eye tracking data can be identified based on the dotplot.

Stated another way, a method for analyzing eye tracking data can comprise receiving the eye tracking data. The eye tracking data can comprise a plurality of scanpaths, each of the plurality of scanpaths representing a sequence of visual fixations on a stimulus image. A sequence of tokens corresponding to the sequence of visual fixations can be generated. For example, each token of the sequence of tokens corresponding to the sequence of visual fixations can comprise a region name identifying one of a plurality of regions of interest of the stimulus image in which the corresponding visual fixation is located. A dotplot can be generated using the sequence of tokens. One or more patterns of sequentially matching tokens within the eye tracking data can be identified based on the dotplot. For example, identifying one or more patterns can comprise identifying linear relationships within the plurality of scanpaths, for example, based on a linear regression.

In some cases, two or more scanpaths of the plurality of scanpaths can be aggregated, i.e., a scanpath representing two matching or partially matching scanpaths can be generated, based on the identified one or more patterns. A representation of the aggregated two or more scanpaths can be displayed. For example, the representation of the aggregated two or more scanpaths can comprise a graphical representation of the stimulus image including an indication of the aggregated two or more scanpaths.

According to another embodiment, a system for analyzing eye tracking data can comprise a processor and a memory communicatively coupled with and readable by the processor. The memory can have stored therein a series of instructions which, when executed by the processor, cause the processor to receive the eye tracking data. The eye tracking data can comprise a plurality of scanpaths, each of the plurality of scanpaths representing a sequence of visual fixations on a stimulus image. A sequence of tokens corresponding to the sequence of visual fixations can be generated. For example, each token of the sequence of tokens corresponding to the sequence of visual fixations can comprise a region name identifying one of a plurality of regions of interest of the stimulus image in which the corresponding visual fixation is located. A dotplot can be generated using the sequence of tokens. One or more patterns of sequentially matching tokens within the eye tracking data can be identified based on the dotplot. For example, identifying one or more patterns can comprise identifying linear relationships within the plurality of scanpaths, for example, based on a linear regression.

In some cases, the instructions may further cause the processor to aggregate two or more scanpaths of the plurality of scanpaths, i.e., a scanpath representing two matching or partially matching scanpaths can be generated, based on the identified one or more patterns. A representation of the aggregated two or more scanpaths can be displayed. For example, the representation of the aggregated two or more scanpaths can comprise a graphical representation of the stimulus image including an indication of the aggregated two or more scanpaths.

According to yet another embodiment, a machine-readable medium can have stored thereon a series of instructions which, when executed by a processor, cause the processor to analyze eye tracking data by receiving the eye tracking data. The eye tracking data can comprise a plurality of scanpaths, each of the plurality of scanpaths representing a sequence of visual fixations on a stimulus image. A sequence of tokens corresponding to the sequence of visual fixations can be generated. For example, each token of the sequence of tokens corresponding to the sequence of visual fixations can comprise a region name identifying one of a plurality of regions of interest of the stimulus image in which the corresponding visual fixation is located. A dotplot can be generated using the sequence of tokens. One or more patterns of sequentially matching tokens within the eye tracking data can be identified based on the dotplot. For example, identifying one or more patterns can comprise identifying linear relationships within the plurality of scanpaths, for example, based on a linear regression.

In some cases, two or more scanpaths of the plurality of scanpaths can be aggregated, i.e., a scanpath representing two matching or partially matching scanpaths can be generated, based on the identified one or more patterns. A representation of the aggregated two or more scanpaths can be displayed. For example, the representation of the aggregated two or more scanpaths can comprise a graphical representation of the stimulus image including an indication of the aggregated two or more scanpaths.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram illustrating components of an exemplary operating environment in which various embodiments of the present invention may be implemented.

FIG. 2 is a block diagram illustrating an exemplary computer system in which embodiments of the present invention may be implemented.

FIG. 3 is a block diagram illustrating, at a high-level, functional components of a system for analyzing eye tracking data according to one embodiment of the present invention.

FIG. 4 illustrates an exemplary stimulus image of a user interface which may be used with embodiments of the present invention and a number of exemplary scanpaths.

FIG. 5 is a chart illustrating an exemplary dotplot for sequences of data according to one embodiment of the present invention.

FIG. 6 illustrates an exemplary stimulus image with aggregated scanpaths displayed thereon according to one embodiment of the present invention.

FIG. 7 is a flowchart illustrating a process for analyzing eye tracking data according to one embodiment of the present invention.

FIG. 8 is a block diagram illustrating an exemplary software architecture for implementing an eye tracking data analysis process according to one embodiment of the present invention.

FIG. 9 is a flowchart illustrating a linear regression process for identifying patterns within a dotplot of sequential data according to one embodiment of the present invention.

DETAILED DESCRIPTION

In the following description, for the purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of various embodiments of the present invention. It will be apparent, however, to one skilled in the art that embodiments of the present invention may be practiced without some of these specific details. In other instances, well-known structures and devices are shown in block diagram form.

The ensuing description provides exemplary embodiments only, and is not intended to limit the scope, applicability, or configuration of the disclosure. Rather, the ensuing description of the exemplary embodiments will provide those skilled in the art with an enabling description for implementing an exemplary embodiment. It should be understood that various changes may be made in the function and arrangement of elements without departing from the spirit and scope of the invention as set forth in the appended claims.

Specific details are given in the following description to provide a thorough understanding of the embodiments. However, it will be understood by one of ordinary skill in the art that the embodiments may be practiced without these specific details. For example, circuits, systems, networks, processes, and other components may be shown as components in block diagram form in order not to obscure the embodiments in unnecessary detail. In other instances, well-known circuits, processes, algorithms, structures, and techniques may be shown without unnecessary detail in order to avoid obscuring the embodiments.

Also, it is noted that individual embodiments may be described as a process which is depicted as a flowchart, a flow diagram, a data flow diagram, a structure diagram, or a block diagram. Although a flowchart may describe the operations as a sequential process, many of the operations can be performed in parallel or concurrently. In addition, the order of the operations may be re-arranged. A process is terminated when its operations are completed, but could have additional steps not included in a figure. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, etc. When a process corresponds to a function, its termination can correspond to a return of the function to the calling function or the main function.

The term “machine-readable medium” includes, but is not limited to, portable or fixed storage devices, optical storage devices, wireless channels and various other mediums capable of storing, containing or carrying instruction(s) and/or data. A code segment or machine-executable instructions may represent a procedure, a function, a subprogram, a program, a routine, a subroutine, a module, a software package, a class, or any combination of instructions, data structures, or program statements. A code segment may be coupled to another code segment or a hardware circuit by passing and/or receiving information, data, arguments, parameters, or memory contents. Information, arguments, parameters, data, etc. may be passed, forwarded, or transmitted via any suitable means including memory sharing, message passing, token passing, network transmission, etc.

Furthermore, embodiments may be implemented by hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to perform the necessary tasks may be stored in a machine readable medium. A processor(s) may perform the necessary tasks.

Embodiments of the present invention provide systems and methods for analyzing sequential data representing paths such as eye tracking data including scanpaths representing users' eye movements while viewing a stimulus image or other scene. As the term is used herein, a path may be defined as a sequence of two or more points. The first point in the sequence of points may be referred to as the start point of the path and the last point in the sequence may be referred to as the end point of the path. The portion of a path between any two consecutive points in the sequence of points may be referred to as a path segment. A path may comprise one or more segments.

A sequence may be any list of tokens or symbols in a particular order. Examples of sequences can include but are not limited to words in a query, words in a document, symbols in a computer program's source code, scanpaths, i.e., sequences of eye tracking fixation points as determined by an eye tracking system, sequences of requested URLs in a user's web browsing session, sequences of requested URLs in a web server's log file, etc.

Thus, there are different types of paths considered to be within the scope of the term as used herein. Examples described below have been described with reference to a specific type of path, referred to as a scanpath, which is used to track eye movements. A scanpath is a path that an eye follows when viewing a scene. A scanpath is defined by a sequence of fixation points (or gaze locations). A path segment between two consecutive fixation points in the sequence of fixation points is referred to as a saccade. A scanpath is thus a sequence of fixation points connected by saccades during scene viewing where the saccades represent eye movements between fixation points. For purposes of simplicity, the scanpaths described below are two-dimensional paths. The teachings of the present invention may, however, also be applied to paths in more than two dimensions.

However, it should be understood that, while embodiments of the present invention have been described in context of scanpaths, this is not intended to limit the scope of the present invention as recited in the claims to scanpaths. Teachings of the present invention may also be applied to other types of paths occurring in various different domains such as a stock price graph, a path followed by a car between a start and an end destination, and the like. Various additional details of embodiments of the present invention will be described below with reference to the figures.

FIG. 1 is a block diagram illustrating components of an exemplary operating environment in which various embodiments of the present invention may be implemented. The system 100 can include one or more user computers 105, 110, which may be used to operate a client, whether a dedicated application, web browser, etc. The user computers 105, 110 can be general purpose personal computers (including, merely by way of example, personal computers and/or laptop computers running various versions of Microsoft Corp.'s Windows and/or Apple Corp.'s Macintosh operating systems) and/or workstation computers running any of a variety of commercially-available UNIX or UNIX-like operating systems (including without limitation, the variety of GNU/Linux operating systems). These user computers 105, 110 may also have any of a variety of applications, including one or more development systems, database client and/or server applications, and web browser applications. Alternatively, the user computers 105, 110 may be any other electronic device, such as a thin-client computer, Internet-enabled mobile telephone, and/or personal digital assistant, capable of communicating via a network (e.g., the network 115 described below) and/or displaying and navigating web pages or other types of electronic documents. Although the exemplary system 100 is shown with two user computers, any number of user computers may be supported.

In some embodiments, the system 100 may also include a network 115. The network can be any type of network familiar to those skilled in the art that can support data communications using any of a variety of commercially-available protocols, including without limitation TCP/IP, SNA, IPX, AppleTalk, and the like. Merely by way of example, the network 115 may be a local area network (“LAN”), such as an Ethernet network, a Token-Ring network and/or the like; a wide-area network; a virtual network, including without limitation a virtual private network (“VPN”); the Internet; an intranet; an extranet; a public switched telephone network (“PSTN”); an infra-red network; a wireless network (e.g., a network operating under any of the IEEE 802.11 suite of protocols, the Bluetooth protocol known in the art, and/or any other wireless protocol); and/or any combination of these and/or other networks such as GSM, GPRS, EDGE, UMTS, 3G, 2.5G, CDMA, CDMA2000, WCDMA, EVDO, etc.

The system may also include one or more server computers 120, 125, 130 which can be general purpose computers and/or specialized server computers (including, merely by way of example, PC servers, UNIX servers, mid-range servers, mainframe computers, rack-mounted servers, etc.). One or more of the servers (e.g., 130) may be dedicated to running applications, such as a business application, a web server, application server, etc. Such servers may be used to process requests from user computers 105, 110. The applications can also include any number of applications for controlling access to resources of the servers 120, 125, 130.

The web server can be running an operating system including any of those discussed above, as well as any commercially-available server operating systems. The web server can also run any of a variety of server applications and/or mid-tier applications, including HTTP servers, FTP servers, CGI servers, database servers, Java servers, business applications, and the like. The server(s) also may be one or more computers which can be capable of executing programs or scripts in response to the user computers 105, 110. As one example, a server may execute one or more web applications. The web application may be implemented as one or more scripts or programs written in any programming language, such as Java™, C, C# or C++, and/or any scripting language, such as Perl, Python, or TCL, as well as combinations of any programming/scripting languages. The server(s) may also include database servers, including without limitation those commercially available from Oracle®, Microsoft®, Sybase®, IBM® and the like, which can process requests from database clients running on a user computer 105, 110.

In some embodiments, an application server may create web pages dynamically for displaying on an end-user (client) system. The web pages created by the web application server may be forwarded to a user computer 105 via a web server. Similarly, the web server can receive web page requests and/or input data from a user computer and can forward the web page requests and/or input data to an application and/or a database server. Those skilled in the art will recognize that the functions described with respect to various types of servers may be performed by a single server and/or a plurality of specialized servers, depending on implementation-specific needs and parameters.

The system 100 may also include one or more databases 135. The database(s) 135 may reside in a variety of locations. By way of example, a database 135 may reside on a storage medium local to (and/or resident in) one or more of the computers 105, 110, 120, 125, 130. Alternatively, it may be remote from any or all of the computers 105, 110, 120, 125, 130, and/or in communication (e.g., via the network 115) with one or more of these. In a particular set of embodiments, the database 135 may reside in a storage-area network (“SAN”) familiar to those skilled in the art. Similarly, any necessary files for performing the functions attributed to the computers 105, 110, 120, 125, 130 may be stored locally on the respective computer and/or remotely, as appropriate. In one set of embodiments, the database 135 may be a relational database, such as Oracle 10g, that is adapted to store, update, and retrieve data in response to SQL-formatted commands.

FIG. 2 illustrates an exemplary computer system 200, in which various embodiments of the present invention may be implemented. The system 200 may be used to implement any of the computer systems described above. The computer system 200 is shown comprising hardware elements that may be electrically coupled via a bus 255. The hardware elements may include one or more central processing units (CPUs) 205, one or more input devices 210 (e.g., a mouse, a keyboard, etc.), and one or more output devices 215 (e.g., a display device, a printer, etc.). The computer system 200 may also include one or more storage devices 220. By way of example, storage device(s) 220 may be disk drives, optical storage devices, solid-state storage devices such as a random access memory (“RAM”) and/or a read-only memory (“ROM”), which can be programmable, flash-updateable and/or the like.

The computer system 200 may additionally include a computer-readable storage media reader 225a, a communications system 230 (e.g., a modem, a network card (wireless or wired), an infra-red communication device, etc.), and working memory 240, which may include RAM and ROM devices as described above. In some embodiments, the computer system 200 may also include a processing acceleration unit 235, which can include a DSP, a special-purpose processor and/or the like.

The computer-readable storage media reader 225a can further be connected to a computer-readable storage medium 225b, together (and, optionally, in combination with storage device(s) 220) comprehensively representing remote, local, fixed, and/or removable storage devices plus storage media for temporarily and/or more permanently containing computer-readable information. The communications system 230 may permit data to be exchanged with the network 115 and/or any other computer described above with respect to the system 100.

The computer system 200 may also comprise software elements, shown as being currently located within a working memory 240, including an operating system 245 and/or other code 250, such as an application program (which may be a client application, web browser, mid-tier application, RDBMS, etc.). It should be appreciated that alternate embodiments of a computer system 200 may have numerous variations from that described above. For example, customized hardware might also be used and/or particular elements might be implemented in hardware, software (including portable software, such as applets), or both. Further, connection to other computing devices such as network input/output devices may be employed. Software of computer system 200 may include code 250 for implementing embodiments of the present invention as described herein.

As noted above, embodiments of the present invention provide for analyzing sequential data representing paths such as eye tracking data including scanpaths representing users' eye movements while viewing a stimulus image or other scene. The eye tracking data can represent a number of different scanpaths and can be analyzed, for example, to find patterns or commonality between the scanpaths. According to one embodiment, analyzing eye tracking data with a path analysis system such as the computer system 200 described above can comprise receiving the eye tracking data at the path analysis system. The eye tracking data, which can be obtained by the system in a number of different ways as will be described below, can include a plurality of scanpaths, each scanpath representing a sequence of regions of interest on a scene such as a stimulus image displayed by the system. A dotplot can be generated by the system that represents matches between each of the plurality of scanpaths. One or more patterns within the eye tracking data can then be identified by the system based on the dotplot.

FIG. 3 is a block diagram illustrating, at a high-level, functional components of a system for analyzing eye tracking data according to one embodiment of the present invention. In this example, the path analysis system 300 comprises several components including a user interface 320, a renderer 330, and a path data analyzer 340. The various components may be implemented in hardware, or software (e.g., code, instructions, program executed by a processor), or combinations thereof. Path analysis system 300 may be coupled to a data store 350 that is configured to store data related to processing performed by system 300. For example, path data (e.g., scanpath data) may be stored in data store 350.

User interface 320 provides an interface for receiving information from a user of path analysis system 300 and for outputting information from path analysis system 300. For example, a user of path analysis system 300 may enter path data 360 for a path to be analyzed via user interface 320. Additionally or alternatively, a user of path analysis system 300 may enter commands or instructions via user interface 320 to cause path analysis system 300 to obtain or receive path data 360 from another source. It should be noted, however, that a user interface is entirely optional to the present invention, which does not rely on the existence of a user interface in any way.

System 300 may additionally or alternatively receive path data 360 from various other sources. In one embodiment, the path data may be received from sources such as an eye tracker device. For example, information regarding the fixation points and saccadic eye movements between the fixation points, i.e., path data 360, may be gathered using eye tracking devices such as devices provided by Tobii (e.g., Tobii T60 eye tracker). An eye-tracking device such as the Tobii T60 eye tracker is capable of capturing information related to the saccadic eye activity including location of fixation points, fixation durations, and other data related to a scene or stimulus image, such as a webpage for example, while the user views the scene. Such an exemplary user interface is described in greater detail below with reference to FIG. 4. The Tobii T60 uses infrared light sources and cameras to gather information about the user's eye movements while viewing a scene.

The path data may be received in various formats, for example, depending upon the source of the data. In one embodiment and regardless of its exact source and/or format, path data 360 received by system 300 may be stored in data store 350 for further processing.

Path data 360 received by system 300 from any or all of these sources can comprise data related to a path or plurality of paths to be analyzed by system 300. Path data 360 for a path may comprise information identifying a sequence of points included in the path, and possibly other path related information. For example, for a scanpath, path data 360 may comprise information related to a sequence of fixation points defining the scanpath. Path data 360 may optionally include other information related to a scanpath such as the duration of each fixation point, inter-fixation angles, inter-fixation distances, etc. Additional details of exemplary scanpaths as they relate to an exemplary stimulus image are described below with reference to FIG. 4.
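Merely by way of illustration, the optional inter-fixation quantities mentioned above could be derived from a sequence of fixation points along the following lines. This is a minimal sketch and not a format prescribed by any embodiment: the Fixation record, its field names, the pixel units, and the reading of "inter-fixation angle" as the direction of each saccade relative to the horizontal axis are all assumptions made for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Fixation:
    """Hypothetical fixation record; field names and units are assumed."""
    x: float            # horizontal position on the stimulus image
    y: float            # vertical position on the stimulus image
    duration_ms: float  # how long the eye rested at this point


def inter_fixation_metrics(scanpath):
    """Return (distance, angle_in_degrees) for each saccade, i.e. for each
    pair of consecutive fixation points in the scanpath."""
    metrics = []
    for a, b in zip(scanpath, scanpath[1:]):
        dx, dy = b.x - a.x, b.y - a.y
        distance = math.hypot(dx, dy)
        angle = math.degrees(math.atan2(dy, dx))  # assumed angle convention
        metrics.append((distance, angle))
    return metrics


scanpath = [Fixation(0, 0, 200), Fixation(3, 4, 150), Fixation(3, 10, 300)]
metrics = inter_fixation_metrics(scanpath)
print(metrics)
```

A scanpath of n fixation points yields n - 1 saccades, so the returned list is one element shorter than the input.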

Path data analyzer 340 can be configured to process path data 360 and, for example, identify patterns within the path data. For example, path data analyzer 340 can receive a set of path data 360 representing multiple scanpaths and can analyze these scanpaths to identify patterns, i.e., similar or matching portions therein. According to one embodiment, the path data analyzer can include a dotplot generator 380 and dotplot analyzer 390. Dotplot generator 380 can be adapted to generate a dotplot such as illustrated in and described below with reference to FIG. 5. Such a dotplot can accept as input, or be generated based on sequences related to each scanpath of the path data. Dotplot analyzer 390 can then, based on the dotplot, identify patterns within the scanpaths. For example, dotplot analyzer 390 can perform a linear regression process on the dots in the dotplot as described below with reference to FIG. 9 to identify sequential matches between the paths or portions of the paths, i.e., between two or more sub-sequences of fixation points. In some cases, sequential matches between two or more scanpaths can be used to generate a new scanpath, which can be thought of as an “aggregate” or representative scanpath in that it represents matching tokens in both scanpaths that occur in the same sequential order.
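To make the notion of an aggregate scanpath concrete, the following sketch extracts tokens that appear in two token sequences in the same sequential order. It is offered only as an illustration and deliberately substitutes a standard longest-common-subsequence computation for the linear regression process of FIG. 9; the function name and the sample sequences are hypothetical.

```python
def common_subsequence(a, b):
    """Return one longest list of tokens occurring in both a and b in the
    same order. A stand-in technique, not the regression of FIG. 9."""
    m, n = len(a), len(b)
    # table[i][j] = length of the longest common subsequence of a[i:] and b[j:]
    table = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m - 1, -1, -1):
        for j in range(n - 1, -1, -1):
            if a[i] == b[j]:
                table[i][j] = 1 + table[i + 1][j + 1]
            else:
                table[i][j] = max(table[i + 1][j], table[i][j + 1])
    # Walk the table from the start to recover the matching tokens.
    result, i, j = [], 0, 0
    while i < m and j < n:
        if a[i] == b[j]:
            result.append(a[i])
            i += 1
            j += 1
        elif table[i + 1][j] >= table[i][j + 1]:
            i += 1
        else:
            j += 1
    return result


first = ["H", "D", "G", "F", "E"]
second = ["D", "F", "E", "H"]
aggregate = common_subsequence(first, second)
print(aggregate)  # ['D', 'F', 'E']
```

Unlike a regression over dotplot cells, this stand-in tolerates gaps of any length between matches, so it should be read as showing the kind of output an aggregate scanpath represents rather than how a particular embodiment computes it.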

Path analysis system 300 can also include renderer 330. Renderer 330 can be configured to receive the dotplot generated by dotplot generator 380 and/or an output of dotplot analyzer 390 and provide, e.g., via user interface 320, a display or other representation of the results. For example, renderer 330 may provide a graphical representation of the dotplot including an indication, e.g., highlighting, shading, coloring, etc., indicating portions containing matches or identified patterns. Additionally or alternatively, renderer 330 may provide a graphical representation of the scene or stimulus image for which the eye tracking data was obtained with a representation of the aggregated scanpaths presented thereon as illustrated in and described in greater detail below with reference to FIG. 6.

As noted above, the path data 360, i.e., information regarding the fixation points and saccadic eye movements between the fixation points, may be gathered using eye tracking devices such as devices capable of capturing information related to the saccadic eye activity including location of fixation points, fixation durations, and other data related to a scene or stimulus image while the user views the scene or image. Such a stimulus image can comprise, for example, a webpage or other user interface which, based on analysis of various scanpaths, may be evaluated for possible improvements to the format or layout thereof.

FIG. 4 illustrates an exemplary stimulus image of a user interface which may be used with embodiments of the present invention and a number of exemplary scanpaths. It should be noted that this stimulus image and user interface are provided for illustrative purposes only and are not intended to limit the scope of the present invention. Rather, any number of a variety of different stimulus images, user interfaces, or means and/or methods of obtaining a query sequence are contemplated and considered to be within the scope of the present invention.

In this example, the image, which can comprise for example a web page 402 or other user interface of a software application, includes a number of elements which each, or some of which, can be considered a particular region of interest. For example, webpage 402 may be considered to comprise multiple regions such as: A (page header), B (page navigation area), C (page sidebar), D (primary tabs area), E (subtabs area), F (table header), G (table left), H (table center), I (table right), J (table footer), and K (page footer). Webpage 402 may be displayed on an output device such as a monitor and viewed by the user.

FIG. 4 also depicts exemplary scanpaths 400 and 404 representing eye movements of one or more users while viewing the webpage 402 and obtained or captured by an eye tracking device as described above. Paths 400 and 404 show the movements of the users' eyes across the various regions of page 402. The circles depicted in FIG. 4 represent fixation points. A fixation point marks a location in the scene where the saccadic eye movement stops for a brief period of time while viewing the scene. In some cases, a fixation point can be represented by, for example, a label or name identifying a region of interest of the page in which the fixation occurs. So for example, scanpath 400 depicted in FIG. 4 may be represented by the following sequence of region names {H, D, G, F, E, D, I, H, H, J, J, J}.
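The mapping from fixation points to region names described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each region of interest is an axis-aligned rectangle given in pixel coordinates and that fixations falling outside every region are skipped; the names `region_of` and `tokenize` and the rectangle format are hypothetical and do not appear in the original description.

```python
from typing import Dict, List, Optional, Sequence, Tuple

# Hypothetical representation: region name -> (left, top, right, bottom) in pixels.
Rect = Tuple[float, float, float, float]


def region_of(x: float, y: float, regions: Dict[str, Rect]) -> Optional[str]:
    """Return the name of the first region containing (x, y), or None."""
    for name, (left, top, right, bottom) in regions.items():
        if left <= x < right and top <= y < bottom:
            return name
    return None


def tokenize(fixations: Sequence[Tuple[float, float]],
             regions: Dict[str, Rect]) -> List[str]:
    """Replace each fixation with the name of its region of interest.

    Assumption: fixations outside all regions are dropped rather than
    given a placeholder token.
    """
    tokens = []
    for x, y in fixations:
        name = region_of(x, y, regions)
        if name is not None:
            tokens.append(name)
    return tokens


# Two made-up regions and three made-up fixations.
example_regions = {"D": (0, 0, 100, 50), "H": (0, 50, 100, 200)}
example_tokens = tokenize([(10, 60), (10, 10), (500, 500)], example_regions)
```

A sequence produced this way plays the role of the region-name sequence given for scanpath 400.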

The scanpath data gathered by an eye tracker can be used by embodiments of the present invention to identify patterns within the path data. For example, a set of path data representing multiple scanpaths can be analyzed to identify patterns, i.e., similar or matching portions therein. According to one embodiment, a dotplot can be generated that represents matches between sequences related to each scanpath of the path data. The dotplot can then be analyzed to identify patterns within the scanpaths.

FIG. 5 is a chart illustrating an exemplary dotplot for sequences of data according to one embodiment of the present invention. Generally speaking, a dotplot 500 such as illustrated in this example is a graphical technique for visualizing similarities within a sequence of tokens or between two or more concatenated sequences of tokens. For example, in one embodiment sequences of tokens may be formed from scanpath data by substituting the name of a pre-defined region of interest on a stimulus image for each scanpath fixation on that image. Dotplot 500 can be created by listing one string or sequence, represented by and corresponding to the sequence of region of interest names, on both the horizontal axis 504 and the vertical axis 502 of a matrix. Such a matrix is symmetric about a main upper-left to lower-right diagonal 506. Dots, e.g., 505, 510, and 515, can be placed in each intersecting cell of matching tokens. Additionally, these dots, e.g., 505, 510, and 515, can be weighted to emphasize tokens that are more likely to be meaningful for particular applications. For example, and according to one embodiment, tokens can be inverse-frequency weighted to down-weight regions that are fixated extremely often or are otherwise trivial or uninteresting, making it easier to discover more significant eye movement patterns. This weighting can be shown on the dotplot 500 in color or shading and is illustrated in this example in dots with light hatching, e.g., 505, dots with heavy hatching, e.g., 510, and solid dots, e.g., 515. While three levels of weighting are illustrated here for the sake of clarity, it should be noted that embodiments of the present invention are not so limited. Similarly, it should be noted that the dotplot 500 illustrated in this example is significantly simplified for the sake of brevity and clarity but should not be considered as limiting on the type or extent of the dataset that can be handled by embodiments of the present invention. Rather, it should be understood that datasets for various implementations and embodiments and the corresponding dotplots can be extensive. Weighting can be applied based on different considerations. For example, when a large dataset, i.e., a large number of scanpaths, is analyzed resulting in a very large or complex dotplot, various tokens, i.e., fixation points, can be weighted based on their relative importance or interest.
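As a rough illustration of the dotplot construction and weighting described above, the following Python sketch builds a square matrix for one token sequence and places a weighted dot wherever the row and column tokens match. The specific weight used here, one divided by the number of occurrences of the token, is only one possible reading of "inverse-frequency weighted" and is an assumption, as is the function name `weighted_dotplot`.

```python
from collections import Counter
from typing import List, Sequence


def weighted_dotplot(tokens: Sequence[str]) -> List[List[float]]:
    """Build a square dotplot for one token sequence.

    Cell [row][col] holds a positive weight when tokens[row] == tokens[col]
    and 0.0 otherwise. Assumed weighting: 1 / (occurrences of the token),
    so frequently fixated regions contribute fainter dots.
    """
    counts = Counter(tokens)
    weights = {token: 1.0 / count for token, count in counts.items()}
    return [
        [weights[row_tok] if row_tok == col_tok else 0.0 for col_tok in tokens]
        for row_tok in tokens
    ]


# Region-name sequence given for scanpath 400 in FIG. 4.
scanpath_400 = ["H", "D", "G", "F", "E", "D", "I", "H", "H", "J", "J", "J"]
plot_400 = weighted_dotplot(scanpath_400)
```

Because the same sequence labels both axes, the resulting matrix is symmetric about its main diagonal, consistent with the description of diagonal 506.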

As noted above, the sequence of tokens represented in the dotplot 500 can correspond to a sequence of visual fixations within a set of regions of interest on a stimulus image. In such cases and as illustrated here, each token can comprise a region name identifying one of a plurality of regions of interest of the stimulus image in which the corresponding visual fixation is located. However, it should be understood that, in other embodiments, other identifiers can be used. For example, tokens can represent fixation duration, time between fixations, distance between fixations (a.k.a. saccade length), angles between fixations, etc. It should be understood that, while tokens comprising or representing region names may be useful when graphing or displaying results as will be described below with reference to FIG. 6, these other types of tokens can be equally useful, even if not used for graphing or displaying results, and are also considered to be within the scope of the present invention.

The dotplot 500 can be used to identify matches and reverse matches between sequences of data points or tokens. Such sequences are represented in the dotplot 500 in this example by lines 520, 525, and 530 through the dots of the particular sequence. For example, line 520 represents the sequence of tokens “JIED.” Similarly, line 525 represents the sequence “DEGDH” and line 530 represents the sequence “HDEG.” According to one embodiment, these sequences can be identified based on line fitting processes such as various linear regression processes including but not limited to a process such as described below with reference to FIG. 9.

Stated another way, strings comprising tokens corresponding to the region of interest in which a fixation point is detected can be concatenated and cross-plotted in a dotplot 500, placing a dot in matching rows and columns as illustrated in FIG. 5. The dotplot 500 can contain both self-matching scanpath sub-matrices along the diagonal and cross-matching scanpath sub-matrices off the main diagonal. For example and as illustrated here, the dotplot can include sub-matrices 540, 545, 550, and 555 in four quadrants of the dotplot 500 and separated here for illustrative purposes by bold vertical and horizontal lines 560 and 565. It should be understood that this example has a single distinct cross-matching sub-matrix 540 because its input consists of just two sequences. In general, if a dotplot's input consists of N sequences, there will be N*(N−1)/2 distinct cross-matching sub-matrices. Each cross-matching sub-matrix contains dots or points that correspond to the tokens that match between two scanpaths. Note that although each cross-matching sub-matrix appears twice, both in the upper right and again, transposed, in the lower left, each cross-matching sub-matrix need be examined only once to find matches between all pairs of scanpaths as described below and in FIG. 9.
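The point that each pair of scanpaths needs to be examined only once can be sketched in Python as follows. The function name `cross_submatrices` and the (row, column) dot representation are assumptions made for this example; the sketch simply enumerates each unordered pair of input sequences and lists the cells where their tokens match, which yields N*(N−1)/2 sub-matrices for N sequences.

```python
from itertools import combinations
from typing import Iterator, List, Sequence, Tuple

Dot = Tuple[int, int]  # (row index in first scanpath, column index in second)


def cross_submatrices(
    scanpaths: Sequence[Sequence[str]],
) -> Iterator[Tuple[int, int, List[Dot]]]:
    """Yield (i, j, dots) once for every unordered pair of scanpaths, i < j.

    The transposed copy of each sub-matrix (pair j, i) is never generated,
    since it contains the same matches.
    """
    for i, j in combinations(range(len(scanpaths)), 2):
        first, second = scanpaths[i], scanpaths[j]
        dots = [
            (row, col)
            for row, row_tok in enumerate(first)
            for col, col_tok in enumerate(second)
            if row_tok == col_tok
        ]
        yield i, j, dots


# Three short made-up sequences give 3 * (3 - 1) / 2 = 3 sub-matrices.
pairs = list(cross_submatrices([list("AB"), list("BA"), list("AA")]))
```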

Matching sequences between the strings can be found, for example, by fitting linear regression lines through filled cells. For example, the isolated sub-matrix 540 illustrated in FIG. 5 shows that three patterns were located: (1) line 525 "DEGDH", a matching pattern relationship from fixating the regions of interest (D) Primary Tabs, (E) Subtabs, (G) Table Left, (D) Primary Tabs, then (H) Table Center of the stimulus image of FIG. 4; (2) line 530 "HDEG", a reverse match from moving between the regions of interest (H) Table Center, (D) Primary Tabs, (E) Subtabs, and (G) Table Left; and (3) line 520 "JIED", a second reverse match moving vertically along the right side of the page, i.e., (J) Table Footer, (I) Table Right, (E) Subtabs, and (D) Primary Tabs of the stimulus image of FIG. 4.

It should be understood that such a dotplot 500 can be used to represent any variety of different types of data. For example, the data can represent protein, DNA, and RNA sequences and the dotplot 500 can be used to identify insertions, deletions, matches, and reverse matches in the data. In another example, the data can represent text sequences and the dotplot can be used to identify the matching sequences in literature, detect plagiarism, align translated documents, identify copied computer source code, etc. According to one embodiment, the dataset can represent eye tracking data, i.e., data obtained from a system for tracking the movements of a human eye. In such cases, tokens can represent fixation points, e.g., on particular regions of interest on a user interface, and the sequences can represent scanpaths or movements of the eye between the regions.

Regardless of exactly what type of dataset is used, embodiments described herein can include identifying patterns of sequential matches within the scanpaths or portions of the paths. In some cases, two scanpaths can be aggregated, i.e., an aggregated scanpath can be generated, based on and representing the identified patterns. As noted above, once patterns have been identified within the scanpaths and two scanpaths are aggregated to represent the identified pattern(s), a representation of the results can be provided. For example, a graphical representation of the dotplot 500 can be provided including an indication, e.g., highlighting, shading, coloring, etc., of portions containing matches or identified patterns. Additionally or alternatively, a graphical representation of the scene or stimulus image for which the eye tracking data was obtained can be provided. In such a case, the representation of the stimulus image can include a representation or indication of the aggregated scanpath(s), for example, displayed with or overlaid on the stimulus image.

FIG. 6 illustrates an exemplary stimulus image with matching scanpaths displayed thereon according to one embodiment of the present invention. In this example, the image comprises the web page 402 described above with reference to FIG. 4. That is, the stimulus image can be displayed and the patterns identified can be indicated thereon. These patterns can be presented or visualized on the stimulus image as a set of nodes, e.g., 660 and 665, and arcs connecting defined regions of the stimulus image.

So for example, FIG. 6 includes line 670 corresponding to line 525 of FIG. 5, i.e., a matching pattern relationship from fixations within the regions of interest (D) Primary Tabs, (E) Subtabs, (G) Table Left, (D) Primary Tabs, then (H) Table Center of the stimulus image. Similarly, line 675 of FIG. 6 corresponds to line 530 of FIG. 5, i.e., the reverse match from moving between the regions of interest (H) Table Center, (D) Primary Tabs, (E) Subtabs, and (G) Table Left. Also, line 680 of FIG. 6 corresponds to line 520 of FIG. 5, i.e., a second reverse match moving vertically along the right side of the page, i.e., (J) Table Footer, (I) Table Right, (E) Subtabs, and (D) Primary Tabs.

FIG. 7 is a flowchart illustrating a process for analyzing eye tracking data according to one embodiment of the present invention. In this example, the process can begin with receiving 710 the eye tracking data. As described above, the eye tracking data can comprise a plurality of scanpaths, each of the plurality of scanpaths representing a sequence of fixations in various regions of interest on a stimulus image. A dotplot representing matches between each of the scanpaths can be generated 715. Generating 715 the dotplot can comprise generating a sequence of tokens corresponding to the sequence of visual fixations. For example, each token of the sequence of tokens corresponding to the sequence of visual fixations can comprise a region name identifying one of a plurality of regions of interest of the stimulus image in which the corresponding visual fixation is located. The dotplot can then be generated using the sequence of tokens.

One or more patterns can be identified 720 within the eye tracking data based on the dotplot. According to one embodiment, identifying 720 one or more patterns can comprise identifying linear relationships within the plurality of scanpaths. For example, identifying linear relationships within the plurality of scanpaths can be based on a linear regression process such as described below with reference to FIG. 9.

Two scanpaths of the plurality of scanpaths can be aggregated 725 based on the identified one or more patterns. In some cases, a representation of the aggregated two or more scanpaths can be presented 730. For example, in the case of path data associated with spatial positions, such as eye-tracking data, the representation of the aggregated scanpaths can comprise a graphical representation of the stimulus image such as illustrated in and described above with reference to FIG. 6. As described above, the representation can also include an indication of the aggregated two or more scanpaths wherein nodes or points on the image represent fixations within a region of interest. Lines connecting the nodes can represent the aggregated scanpath.

FIG. 8 is a block diagram illustrating an exemplary software architecture for implementing an eye tracking data analysis process according to one embodiment of the present invention. Such an architecture can be implemented, for example, on a path analysis system such as system 300 described above with reference to FIG. 3 or another system as described above with reference to FIG. 2, or multiple systems over a network as described above with reference to FIG. 1. It should also be noted that this architecture is described here for illustrative purposes only and is not intended to limit the scope of the present invention. Rather, other architectures are thought to also be suitable for implementing embodiments of the invention described herein.

This example illustrates a three-tier web application architecture comprising a client objects tier 810, a middle-tier services tier 820, and a back-end services tier 830. Client objects tier 810 can comprise objects 811-815 for implementing a user interface and other client-side modules or components including but not limited to a query object 811, a results table object 812, a dotplot object 813, a scanpath/screenshot object 814, and a graphical results object 815. Middle-tier services tier 820 can comprise a number of services 821-824 for interfacing the client objects 810 with the back-end service. The services 821-824 of the middle-tier services tier 820 can include but are not limited to a dotplot service 821, a region service 822, a graphical results service 823, and a Structured Query Language (SQL) service 824. Back-end services tier 830 can comprise a system maintaining a database 831 or other repository of data such as scanpath data.

In operation, client objects 810 such as the query object 811 can be used for specifying a query and transmitting the query 835 to the middle-tier services tier 820. The query 835 can invoke servlets, e.g., SQL service 824, that send an SQL or other query 845 to query a database 831 or repository of the back-end services 830 and process the results 850. For example, dotplot service 821 can compute a dotplot for retrieved scanpath data as described above, region service 822 can determine regions of interest associated with fixations in the scanpaths, and graphical results service 823 can generate representations of the results as described above. The services 821-824 of the middle-tier services tier 820 can return such results to the client objects 810.

The client objects 810 can then present menus and text boxes to specify subsets of the eye tracking datasets, as well as parameters that are passed to the dotplot object 813. Query results 840 from the middle-tier services tier 820 can be displayed in the client browser with the results table object 812, dotplot object 813, scanpath/screenshot object 814, and graphical results object 815.

Regardless of the type of hardware and/or software used, embodiments of the present invention provide for analyzing sequential data representing paths such as eye tracking data. The eye tracking data or other sequential data comprising an ordered set of tokens representing paths can be analyzed, for example, to find sequential patterns or commonality between the scanpaths. Generally speaking, a dotplot can be generated as described above based on the tokens and can represent matches between each path of the plurality of paths. One or more patterns within the sequential data can then be identified based on the dotplot. For example, linear relationships in the dotplot can be detected using least-squares regression. Weighted or un-weighted regression may be conducted directly on the weighted data mentioned above. An exemplary algorithm for identifying patterns in the dotplot can be outlined as follows:

1. Start with an inverse-frequency weighted dotplot sub-matrix comparing two string sequences. Dots in the sub-matrix correspond to matching tokens in the corresponding sub-sequences. A high-pass threshold can be applied to the dots to determine an initial set of points to use for the regression, e.g., a threshold of 0.1-0.5, using a criterion of 1μ+1σ.
2. Fit a linear regression to the points. If R² value is too low (e.g., under 0.5), there is no evidence for a line. Return to Step 1, and continue to the next dotplot sub-matrix.
3. Determine the distribution of distances of the points to the line, e.g. compute 1μ+1σ criterion from Euclidean distances between the regression line and the data points.
4. Re-compute the regression, using data that are close to the regression line (e.g., within 1μ+1σ).
5. Identified points represent sequential matches (negative slopes) or reverse sequential matches (positive slopes). Output the sequential matches, and remove the identified points from the original set of filtered points. Return to Step 2, to possibly locate another linear relationship in the remaining points.
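The steps above can be sketched in Python for a single sub-matrix. This is only an illustrative reading of the outline, not a definitive implementation: the function names, the default threshold of 0.1, the minimum of three points per fit, and the use of un-weighted ordinary least squares are all assumptions. Points are taken as (x, y) cell indices, so the sign of a reported slope depends on how the caller orients the axes; the sketch reports the slope and leaves its interpretation as a match or reverse match to the caller.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float]


def high_pass(weighted: Sequence[Tuple[float, float, float]],
              threshold: float = 0.1) -> List[Point]:
    """Step 1: keep dots (x, y, weight) whose weight meets the threshold."""
    return [(x, y) for x, y, w in weighted if w >= threshold]


def fit_line(points: Sequence[Point]) -> Optional[Tuple[float, float, float]]:
    """Ordinary least squares y = slope * x + intercept.

    Returns (slope, intercept, r_squared), or None when no sloped line can
    be fitted (fewer than two points, or all x equal, or all y equal).
    """
    n = len(points)
    if n < 2:
        return None
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    syy = sum((y - mean_y) ** 2 for _, y in points)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x, (sxy * sxy) / (sxx * syy)


def find_matches(points: Sequence[Point], min_r2: float = 0.5,
                 min_points: int = 3) -> List[Tuple[float, float, List[Point]]]:
    """Steps 2-5: repeatedly fit, trim to mean + 1 std of distance, refit."""
    remaining = list(points)
    matches = []
    while len(remaining) >= min_points:
        fit = fit_line(remaining)
        if fit is None or fit[2] < min_r2:
            break  # Step 2: no evidence for a line in the remaining points
        slope, intercept, _ = fit
        norm = math.hypot(slope, 1.0)
        # Step 3: Euclidean distance from each point to the fitted line.
        dist = [abs(slope * x - y + intercept) / norm for x, y in remaining]
        mu = sum(dist) / len(dist)
        sigma = math.sqrt(sum((d - mu) ** 2 for d in dist) / len(dist))
        # Step 4: keep points within mean + 1 std (small float tolerance).
        inliers = [p for p, d in zip(remaining, dist) if d <= mu + sigma + 1e-12]
        refit = fit_line(inliers)
        if refit is None:
            break
        # Step 5: report the line and remove its points before looping.
        matches.append((refit[0], refit[1], inliers))
        remaining = [p for p in remaining if p not in inliers]
    return matches


diagonal = find_matches([(0, 0), (1, 1), (2, 2), (3, 3)])
anti_diagonal = find_matches([(0, 3), (1, 2), (2, 1), (3, 0)])
```

Each iteration removes at least one point from the working set, so the loop terminates either when too few points remain or when the fit no longer meets the R² criterion.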

FIG. 9 is a flowchart illustrating a linear regression process for identifying patterns within a dotplot of sequential data according to one embodiment of the present invention. In this example, processing begins with determining 910 a dotplot sub-matrix comparing the tokens of two scanpaths. Points in the sub-matrix correspond 915 to matching tokens in the corresponding sub-sequences. A high-pass threshold can be applied 920 to the points to determine an initial set of points to use for the regression (e.g., a threshold of 0.1-0.5, using a criterion of 1μ+1σ) and a linear regression line can be fitted 925 to the filtered points. A determination can be made 930 as to whether there is sufficient statistical evidence for a line. For example, if 930 the R² value is too low (e.g., under 0.5), there is no evidence for a line. In response to determining 930 that no line exists, processing can continue with another dotplot sub-matrix, if any, i.e., by returning to determining points in the next sub-matrix 915.

In response to determining 930 that a line exists within the filtered points, a variance criterion (1μ+1σ) can be computed 935 based on Euclidean distances between the regression line and the filtered points. The set of points can then be further filtered 940 to those within the variance criterion, i.e., within 1μ+1σ. The linear regression line can be recomputed 945 to better fit the remaining points. Information describing the new regression line (e.g., its slope, Y-intercept, and constituent points) can be output 950. Points identified as having linear relationships can be removed 955 from the set of points. In some cases, identifying one or more patterns can further comprise identifying another linear relationship within the points, if any, by repeating said fitting 925 a linear regression line to the filtered points, determining 930 whether another sequential match exists within the filtered points based on a fit of the linear regression line, computing 935 the variance criterion from Euclidean distances between the regression line and the filtered points, further filtering 940 the points to those within the variance criterion, re-computing 945 the linear regression line using the filtered points within the variance criterion, outputting 950 information about the sequential match, and removing 955 the points identified as having linear relationships from the set of points until no points remain in the set of points or no matches exist, i.e., the R² value is too low for the remaining points.

In the foregoing description, for the purposes of illustration, methods were described in a particular order. It should be appreciated that in alternate embodiments, the methods may be performed in a different order than that described. It should also be appreciated that the methods described above may be performed by hardware components or may be embodied in sequences of machine-executable instructions, which may be used to cause a machine, such as a general-purpose or special-purpose processor or logic circuits programmed with the instructions to perform the methods. These machine-executable instructions may be stored on one or more machine readable mediums, such as CD-ROMs or other type of optical disks, floppy diskettes, ROMs, RAMs, EPROMs, EEPROMs, magnetic or optical cards, flash memory, or other types of machine-readable mediums suitable for storing electronic instructions. Alternatively, the methods may be performed by a combination of hardware and software.

While illustrative and presently preferred embodiments of the invention have been described in detail herein, it is to be understood that the inventive concepts may be otherwise variously embodied and employed, and that the appended claims are intended to be construed to include such variations, except as limited by the prior art.

1. A method for analyzing eye tracking data, the method comprising: receiving the eye tracking data at a path analysis system, wherein the eye tracking data comprises a plurality of scanpaths, each of the plurality of scanpaths representing a sequence of visual fixations on a stimulus image; generating, with the path analysis system, a sequence of tokens corresponding to the sequence of visual fixations; generating a dotplot, with the path analysis system, using the sequence of tokens; and identifying, with the path analysis system, one or more patterns of sequentially matching tokens within the eye tracking data based on the dotplot.

2. The method of claim 1, wherein each token of the sequence of tokens corresponding to the sequence of visual fixations comprises a region name identifying one of a plurality of regions of interest of the stimulus image in which the corresponding visual fixation is located.

3. The method of claim 1, wherein identifying one or more patterns comprises identifying linear relationships within the plurality of scanpaths.

4. The method of claim 3, wherein identifying linear relationships within the plurality of scanpaths is based on a linear regression.

5. The method of claim 1, further comprising aggregating two or more scanpaths of the plurality of scanpaths with the path analysis system based on the identified one or more patterns.

6. The method of claim 5, further comprising displaying a representation of the aggregated two or more scanpaths with the path analysis system.

7. The method of claim 6, wherein the representation of the aggregated two or more scanpaths comprises a graphical representation of the stimulus image including an indication of the aggregated two or more scanpaths.

8. A system for analyzing eye tracking data, the system comprising: a processor; and a memory communicatively coupled with and readable by the processor and having stored therein a series of instructions which, when executed by the processor, cause the processor to receive the eye tracking data, wherein the eye tracking data comprises a plurality of scanpaths, each of the plurality of scanpaths representing a sequence of visual fixations on a stimulus image, generate a sequence of tokens corresponding to the sequence of visual fixations, generate a dotplot using the sequence of tokens, and identify one or more patterns of sequentially matching tokens within the eye tracking data based on the dotplot.

9. The system of claim 8, wherein each token of the sequence of tokens corresponding to the sequence of visual fixations comprises a region name identifying one of a plurality of regions of interest of the stimulus image in which the corresponding visual fixation is located.

10. The system of claim 8, wherein identifying one or more patterns comprises identifying linear relationships within the plurality of scanpaths.

11. The system of claim 10, wherein identifying linear relationships within the plurality of scanpaths is based on a linear regression.

12. The system of claim 8, wherein the instructions further cause the processor to aggregate two or more scanpaths of the plurality of scanpaths based on the identified one or more patterns.

13. The system of claim 12, wherein the instructions further cause the processor to display a representation of the aggregated two or more scanpaths.

14. The system of claim 13, wherein the representation of the aggregated two or more scanpaths comprises a graphical representation of the stimulus image including an indication of the aggregated two or more scanpaths.

15. A machine-readable medium having stored thereon a series of instructions which, when executed by a processor, cause the processor to analyze eye tracking data by: receiving the eye tracking data, wherein the eye tracking data comprises a plurality of scanpaths, each of the plurality of scanpaths representing a sequence of visual fixations on a stimulus image; generating a sequence of tokens corresponding to the sequence of visual fixations; generating a dotplot using the sequence of tokens; and identifying one or more patterns of sequentially matching tokens within the eye tracking data based on the dotplot.

16. The machine-readable medium of claim 15, wherein each token of the sequence of tokens corresponding to the sequence of visual fixations comprises a region name identifying one of a plurality of regions of interest of the stimulus image in which the corresponding visual fixation is located.

17. The machine-readable medium of claim 15, wherein identifying one or more patterns comprises identifying linear relationships within the plurality of scanpaths.

18. The machine-readable medium of claim 17, wherein identifying linear relationships within the plurality of scanpaths is based on a linear regression.

19. The machine-readable medium of claim 15, further comprising aggregating two or more scanpaths of the plurality of scanpaths based on the identified one or more patterns.

20. The machine-readable medium of claim 19, further comprising displaying a representation of the aggregated two or more scanpaths.

21. The machine-readable medium of claim 20, wherein the representation of the aggregated two or more scanpaths comprises a graphical representation of the stimulus image including an indication of the aggregated two or more scanpaths.